DDG Task Recovery for Cluster Computing
Abstract
This paper presents a solution to the problem of transparently recovering asynchronous distributed computations on clusters of workstations when a fault occurs on a node. If the system has fault-tolerant features, it can survive the fault and continue its computations. Performance degradation is unavoidable when no hardware redundancy is available. It is a significant advantage if a long-running application can restart from a checkpoint instead of restarting the whole computation. This paper presents the fault-tolerance feature of the DDG environment, which targets cluster systems without spare hardware.
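The advantage of restarting from a checkpoint can be illustrated with a minimal Python sketch. It is not the DDG implementation: the function `run_with_checkpoint`, the JSON checkpoint file, the `fail_at` fault switch, and the four toy tasks are all assumptions made for illustration. The sketch runs tasks in dependency order, saves results after each task, and on restart skips every task whose result is already stored.

```python
import json
import os
import tempfile


def run_with_checkpoint(tasks, order, path, fail_at=None):
    """Run tasks in dependency order, checkpointing after each one.

    tasks:   name -> (function, list of input task names)   (hypothetical layout)
    order:   task names in a valid topological order
    path:    checkpoint file, stored as JSON (an assumption of this sketch)
    fail_at: task name at which a node fault is simulated
    Returns (results, names of tasks executed during this run).
    """
    results = {}
    if os.path.exists(path):
        with open(path) as f:
            results = json.load(f)  # recover results of finished tasks
    executed = []
    for name in order:
        if name in results:
            continue  # already checkpointed, so it is not recomputed
        if name == fail_at:
            raise RuntimeError(f"simulated fault while running {name}")
        func, deps = tasks[name]
        results[name] = func(*(results[d] for d in deps))
        executed.append(name)
        # Write to a temporary file and rename it, so a crash during the
        # write cannot leave a half-written checkpoint behind.
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(results, f)
        os.replace(tmp, path)
    return results, executed


# Toy task graph: c depends on a and b, d depends on c.
tasks = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "c": (lambda x, y: x * y, ["a", "b"]),
    "d": (lambda z: z + 1, ["c"]),
}
order = ["a", "b", "c", "d"]


def demo():
    path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    try:
        run_with_checkpoint(tasks, order, path, fail_at="c")  # fault at c
    except RuntimeError:
        pass
    return run_with_checkpoint(tasks, order, path)  # restart after the fault
```

In `demo()`, the second run recomputes only tasks `c` and `d`; the results of `a` and `b` come from the checkpoint. A real cluster environment would also have to reassign the failed node's tasks to surviving nodes, which this single-process sketch does not attempt.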
Similar resources
Parallel Programming Environment for Cluster Computing
In this paper, we present a new model for parallel program development for cluster computing called Data Driven Graph (DDG). DDG automatically analyzes data dependences among tasks, synchronizes data, and generates task graphs and schedules. Programming in DDG is easy and reliable; most of the work is done automatically by DDG, which not only minimizes the amount of work done by programmers but also ...
Recovery and characterization of α-zein from corn fermentation coproducts.
Zeins were isolated from corn ethanol coproduct distiller's dried grains (DDG) and fractionated into α- and β γ-rich fractions. The effects of the ethanol production process, such as fermentation type, protease addition, and DDG drying temperature on zein recovery, were evaluated. Yield, purity, and molecular properties of recovered zein fractions were determined and compared with zein isolated...
An Effective Task Scheduling Framework for Cloud Computing using NSGA-II
Cloud computing is a model for convenient, on-demand user access to changeable and configurable computing resources such as networks, servers, storage, applications, and services, with minimal resource management and service provider interaction. Task scheduling is regarded as a fundamental issue in cloud computing which aims at distributing the load on the different resources of a distribu...
Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers
This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most usable kernel methods, along with support vector machine classifier, with high accuracy in object recognition. MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...
Green Energy-aware task scheduling using the DVFS technique in Cloud Computing
Nowadays, energy consumption has become a critical issue in high-performance distributed computing systems, so green computing tries to reduce energy consumption, carbon footprint, and CO2 emissions in high-performance computing (HPC) systems such as clusters, grids, and clouds that run a large number of parallel computations. Reducing energy consumption for high-end computing can bring various benefits such as red...